RippleBench: Capturing Ripple Effects Using Existing Knowledge Repositories
Rinberg, Roy, Bhalla, Usha, Shilov, Igor, Calmon, Flavio P., Gandikota, Rohit
Targeted interventions on language models, such as unlearning, debiasing, or model editing, are a central method for refining model behavior and keeping knowledge up to date. While these interventions aim to modify specific information within models (e.g., removing virology content), their effects often propagate to related but unintended areas (e.g., allergies); these side-effects are commonly referred to as the ripple effect. In this work, we present RippleBench-Maker, an automatic tool for generating Q&A datasets that allow for the measurement of ripple effects in any model-editing task. RippleBench-Maker builds on a Wikipedia-based RAG pipeline (WikiRAG) to generate multiple-choice questions at varying semantic distances from the target concept (e.g., the knowledge being unlearned). Using this framework, we construct RippleBench-Bio, a benchmark derived from the WMDP (Weapons of Mass Destruction Proxy) dataset, a common unlearning benchmark. We evaluate eight state-of-the-art unlearning methods and find that all exhibit non-trivial accuracy drops on topics increasingly distant from the unlearned knowledge, each with distinct propagation profiles. To support ongoing research, we release our codebase for on-the-fly ripple evaluation, along with the benchmark, RippleBench-Bio.
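The core operation behind this kind of benchmark, grouping candidate topics into rings of increasing semantic distance from the unlearning target, can be sketched with cosine distance over embeddings. Everything below (the toy 2-D vectors, the `bucket_by_distance` helper, the ring edges) is illustrative and not taken from the paper's code.

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity; 0 means identical direction.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return 1.0 - dot / (nu * nv)

def bucket_by_distance(target_vec, topic_vecs, edges=(0.1, 0.3, 0.6)):
    """Group topics into rings of increasing semantic distance from
    the unlearning target (hypothetical bucketing scheme)."""
    rings = {i: [] for i in range(len(edges) + 1)}
    for name, vec in topic_vecs.items():
        d = cosine_distance(target_vec, vec)
        ring = sum(d > e for e in edges)  # how many edges d exceeds
        rings[ring].append((name, round(d, 3)))
    return rings

# Toy 2-D "embeddings": virology is the unlearned target.
target = (1.0, 0.0)
topics = {
    "vaccines": (0.95, 0.31),   # close to the target
    "allergies": (0.7, 0.71),   # related side-topic
    "astronomy": (0.0, 1.0),    # unrelated
}
rings = bucket_by_distance(target, topics)
```

Questions sampled from successively outer rings then probe how far an accuracy drop propagates from the edited concept.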
Interaction Dynamics as a Reward Signal for LLMs
Gooding, Sian, Grefenstette, Edward
The alignment of Large Language Models (LLMs) for multi-turn conversations typically relies on reward signals derived from the content of the text. This approach, however, overlooks a rich, complementary source of signal: the dynamics of the interaction itself. This paper introduces TRACE (Trajectory-based Reward for Agent Collaboration Estimation), a novel reward signal derived from the geometric properties of a dialogue's embedding trajectory--a concept we term 'conversational geometry'. Our central finding is that a reward model trained only on these structural signals achieves a pairwise accuracy (68.20%) comparable to a powerful LLM baseline that analyzes the full transcript (70.04%). Furthermore, a hybrid model combining interaction dynamics with textual analysis achieves the highest performance (80.17%), demonstrating their complementary nature. This work provides strong evidence that for interactive settings, how an agent communicates is as powerful a predictor of success as what it says, offering a new, privacy-preserving framework that not only aligns agents but also serves as a diagnostic tool for understanding the distinct interaction patterns that drive successful collaboration.
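Geometric descriptors of an embedding trajectory can be computed without reading any text at all, which is where the privacy-preserving claim comes from. The two features below (path length and directness) are a hypothetical stand-in for TRACE's actual feature set, which the abstract does not enumerate.

```python
import math

def _norm(v):
    return math.sqrt(sum(x * x for x in v))

def trajectory_features(turns):
    """Toy 'conversational geometry' descriptors over a sequence of
    per-turn embeddings (assumed features, not the paper's exact set)."""
    steps = [tuple(b - a for a, b in zip(u, v))
             for u, v in zip(turns, turns[1:])]
    path_len = sum(_norm(s) for s in steps)
    net = _norm(tuple(b - a for a, b in zip(turns[0], turns[-1])))
    return {
        "path_length": path_len,
        # 1.0 = the dialogue drifts steadily in one direction;
        # near 0 = it circles back on itself.
        "directness": net / path_len if path_len else 0.0,
    }

# A dialogue that drifts steadily vs. one that loops back.
steady = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
loopy = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
```

A reward model would then be trained on vectors of such features rather than on transcripts.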
Leveraging Text-Driven Semantic Variation for Robust OOD Segmentation
In autonomous driving and robotics, ensuring road safety and reliable decision-making critically depends on out-of-distribution (OOD) segmentation. While numerous methods have been proposed to detect anomalous objects on the road, leveraging the vision-language space, which provides rich linguistic knowledge, remains an underexplored field. We hypothesize that incorporating these linguistic cues can be especially beneficial in the complex contexts found in real-world autonomous driving scenarios. To this end, we present a novel approach that trains a Text-Driven OOD Segmentation model to learn a semantically diverse set of objects in the vision-language space. Concretely, our approach combines a vision-language model's encoder with a transformer decoder, employs Distance-Based OOD prompts located at varying semantic distances from in-distribution (ID) classes, and utilizes OOD Semantic Augmentation for OOD representations. By aligning visual and textual information, our approach effectively generalizes to unseen objects and provides robust OOD segmentation in diverse driving environments. We conduct extensive experiments on publicly available OOD segmentation datasets such as Fishyscapes, SegmentMeIfYouCan, and Road Anomaly, demonstrating that our approach achieves state-of-the-art performance across both pixel-level and object-level evaluations. This result underscores the potential of vision-language-based OOD segmentation to bolster the safety and reliability of future autonomous driving systems.
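One way to read "Distance-Based OOD prompts at varying semantic distances from ID classes" is to band candidate prompt embeddings by their cosine distance to the nearest in-distribution class. The helper, band edges, and toy vectors below are all assumptions for illustration, not the paper's implementation.

```python
import math

def band_ood_prompts(id_vecs, cand_vecs, bands=((0.2, 0.5), (0.5, 1.0))):
    """Assign candidate OOD prompt embeddings to semantic-distance
    bands relative to the NEAREST in-distribution class
    (hypothetical scheme; band edges are arbitrary)."""
    def cos_dist(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(a * a for a in v))
        return 1.0 - dot / (nu * nv)
    out = {b: [] for b in range(len(bands))}
    for name, vec in cand_vecs.items():
        d = min(cos_dist(vec, iv) for iv in id_vecs.values())
        for b, (lo, hi) in enumerate(bands):
            if lo <= d < hi:
                out[b].append(name)
    return out

# Toy setup: one ID class, two candidate OOD prompt embeddings.
banded = band_ood_prompts(
    {"road": (1.0, 0.0)},
    {"animal": (0.7, 0.7), "banner": (0.2, 0.98)},
)
```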
S-DAT: A Multilingual, GenAI-Driven Framework for Automated Divergent Thinking Assessment
Haase, Jennifer, Hanel, Paul H. P., Pokutta, Sebastian
This paper introduces S-DAT (Synthetic-Divergent Association Task), a scalable, multilingual framework for automated assessment of divergent thinking (DT) -a core component of human creativity. Traditional creativity assessments are often labor-intensive, language-specific, and reliant on subjective human ratings, limiting their scalability and cross-cultural applicability. In contrast, S-DAT leverages large language models and advanced multilingual embeddings to compute semantic distance -- a language-agnostic proxy for DT. We evaluate S-DAT across eleven diverse languages, including English, Spanish, German, Russian, Hindi, and Japanese (Kanji, Hiragana, Katakana), demonstrating robust and consistent scoring across linguistic contexts. Unlike prior DAT approaches, the S-DAT shows convergent validity with other DT measures and correct discriminant validity with convergent thinking. This cross-linguistic flexibility allows for more inclusive, global-scale creativity research, addressing key limitations of earlier approaches. S-DAT provides a powerful tool for fairer, more comprehensive evaluation of cognitive flexibility in diverse populations and can be freely accessed online: https://sdat.iol.zib.de/.
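The semantic-distance scoring at the heart of DAT-style assessments is simple to state: the mean pairwise cosine distance between the embeddings of a participant's words, conventionally scaled to 0-100. The toy 2-D embeddings below are illustrative only; a real system would use a multilingual embedding model.

```python
import itertools
import math

def dat_score(words, embed):
    """DAT-style scoring: mean pairwise cosine distance between word
    embeddings, scaled by 100 per the original DAT convention."""
    def dist(a, b):
        u, v = embed[a], embed[b]
        dot = sum(x * y for x, y in zip(u, v))
        nu = math.sqrt(sum(x * x for x in u))
        nv = math.sqrt(sum(x * x for x in v))
        return 1.0 - dot / (nu * nv)
    pairs = list(itertools.combinations(words, 2))
    return 100.0 * sum(dist(a, b) for a, b in pairs) / len(pairs)

# Toy embeddings: related words cluster, unrelated words spread out.
toy = {"cat": (1.0, 0.0), "dog": (0.9, 0.1), "galaxy": (0.0, 1.0)}
related = dat_score(["cat", "dog"], toy)     # semantically close: low
diverse = dat_score(["cat", "galaxy"], toy)  # semantically far: high
```

Because the score depends only on embedding geometry, swapping in a multilingual embedding model makes the same formula language-agnostic, which is the property S-DAT exploits.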
Rethinking RL Evaluation: Can Benchmarks Truly Reveal Failures of RL Methods?
Chen, Zihan, Zhang, Yiming, Zhou, Hengguang, Ding, Zenghui, Sun, Yining, Hsieh, Cho-Jui
Reinforcement Learning (RL) has emerged as a powerful paradigm for post-training Large Language Models (LLMs), significantly enhancing their capabilities on complex, multi-step reasoning tasks (Ouyang et al., 2022). Methods based on Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO) (Rafailov et al., 2023) have become standard practice for aligning LLMs. These paradigms are often powered by foundational algorithms like Proximal Policy Optimization (PPO) (Schulman et al., 2017), with state-of-the-art variants such as Group Relative Policy Optimization (GRPO) (Shao et al., 2024) pushing models to achieve remarkable performance on benchmarks like GSM8K (Cobbe et al., 2021) and MATH (Hendrycks et al., 2021). These successes, often marked by state-of-the-art results (Lewkowycz et al., 2022; Lightman et al., 2023), are widely interpreted as a significant leap forward, suggesting that RL-based alignment is a key pathway toward developing more general and robust machine reasoning systems. Despite impressive reported gains, a key question is whether current benchmarks still meaningfully assess generalization. Our findings suggest that the traditional assumption underlying benchmark design, that a model's ability to perform well on unseen test examples is sufficient to measure generalization, no longer holds for RL. We find that RL-based reasoning models trained on the training split achieve nearly the same performance as those trained directly on the test split, indicating that "unseen-ness" alone is no longer the challenging or discriminative criterion. This calls for the rethinking of evaluation: rather than relying solely on disjoint train/test splits, future benchmarks must incorporate settings that remain sensitive to deeper forms of generalization and can reveal weaknesses that simple data separation fails to expose.
To systematically investigate this phenomenon, we introduce a multi-faceted empirical framework designed not merely to measure performance, but to deconstruct it.
Creative Preference Optimization
Ismayilzada, Mete, Laverghetta, Antonio Jr., Luchini, Simone A., Patel, Reet, Bosselut, Antoine, van der Plas, Lonneke, Beaty, Roger
While Large Language Models (LLMs) have demonstrated impressive performance across natural language generation tasks, their ability to generate truly creative content, characterized by novelty, diversity, surprise, and quality, remains limited. Existing methods for enhancing LLM creativity often focus narrowly on diversity or specific tasks, failing to address creativity's multifaceted nature in a generalizable way. In this work, we propose Creative Preference Optimization (CrPO), a novel alignment method that injects signals from multiple creativity dimensions into the preference optimization objective in a modular fashion. We train and evaluate creativity-augmented versions of several models using CrPO and MuCE, a new large-scale human preference dataset spanning over 200,000 human-generated responses and ratings from more than 30 psychological creativity assessments. Our models outperform strong baselines, including GPT-4o, on both automated and human evaluations, producing more novel, diverse, and surprising generations while maintaining high output quality. Additional evaluations on NoveltyBench further confirm the generalizability of our approach. Together, our results demonstrate that directly optimizing for creativity within preference frameworks is a promising direction for advancing the creative capabilities of LLMs without compromising output quality.
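One hypothetical reading of "injecting multiple creativity dimensions into the preference objective in a modular fashion" is a weighted aggregation of per-dimension scores that decides which of two responses is preferred. The function, weights, and scores below are all illustrative assumptions, not CrPO's actual objective.

```python
def creative_preference(scores_a, scores_b, weights):
    """Label which of two responses is preferred by aggregating
    per-dimension creativity scores (hypothetical scheme: swapping
    the weights is the 'modular' knob)."""
    def agg(scores):
        return sum(weights[d] * scores[d] for d in weights)
    return "a" if agg(scores_a) > agg(scores_b) else "b"

# Toy scores on the four dimensions the abstract names.
w = {"novelty": 0.3, "diversity": 0.2, "surprise": 0.2, "quality": 0.3}
a = {"novelty": 0.9, "diversity": 0.7, "surprise": 0.8, "quality": 0.6}
b = {"novelty": 0.4, "diversity": 0.5, "surprise": 0.3, "quality": 0.9}
winner = creative_preference(a, b, w)
```

Preference pairs labeled this way could then feed a standard DPO-style training loop.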
Characterizing Fitness Landscape Structures in Prompt Engineering
While prompt engineering has emerged as a crucial technique for optimizing large language model performance, the underlying optimization landscape remains poorly understood. Current approaches treat prompt optimization as a black-box problem, applying sophisticated search algorithms without characterizing the landscape topology they navigate. We present a systematic analysis of fitness landscape structures in prompt engineering using autocorrelation analysis across semantic embedding spaces. Through experiments on error detection tasks with two distinct prompt generation strategies -- systematic enumeration (1,024 prompts) and novelty-driven diversification (1,000 prompts) -- we reveal fundamentally different landscape topologies. Systematic prompt generation yields smoothly decaying autocorrelation, while diversified generation exhibits non-monotonic patterns with peak correlation at intermediate semantic distances, indicating rugged, hierarchically structured landscapes. Task-specific analysis across 10 error detection categories reveals varying degrees of ruggedness across different error types. Our findings provide an empirical foundation for understanding the complexity of optimization in prompt engineering landscapes.
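Autocorrelation analysis over a landscape can be estimated directly from prompt pairs: bin pairs by their distance in embedding space, then compute the correlation of their fitness values within each bin. The sketch below uses a smooth 1-D toy landscape and arbitrary bin edges; it illustrates the decaying-autocorrelation signature the paper reports for systematic generation, not the paper's exact estimator.

```python
import math
from itertools import combinations

def distance_autocorrelation(points, fitness, bins=(1.0, 2.0, 3.0)):
    """Estimate fitness autocorrelation per distance bin: normalized
    covariance of fitness over all pairs whose distance falls in the
    bin (simple landscape-analysis sketch; bin edges are arbitrary)."""
    mean = sum(fitness) / len(fitness)
    var = sum((f - mean) ** 2 for f in fitness) / len(fitness)
    sums = [0.0] * len(bins)
    counts = [0] * len(bins)
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        for b, edge in enumerate(bins):
            if d <= edge:
                sums[b] += (fitness[i] - mean) * (fitness[j] - mean)
                counts[b] += 1
                break
    return [s / (c * var) if c else None for s, c in zip(sums, counts)]

# A smooth 1-D landscape: nearby "prompts" have similar fitness,
# so autocorrelation decays monotonically with distance.
pts = [(float(x),) for x in range(6)]
fit = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
rho = distance_autocorrelation(pts, fit)
```

A rugged landscape would instead show non-monotonic `rho`, peaking at intermediate distances, which is the signature the diversified generation strategy exhibits.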
Linguistic Loops and Geometric Invariants as a Way to Pre-Verbal Thought?
Corradetti, Daniele, Marrani, Alessio
In this work we introduce the concepts of linguistic transformation, linguistic loop and semantic deficit. By exploiting Lie group-theoretical and geometric techniques, we define invariants that capture the structural properties of a whole linguistic loop. This result introduces a new line of research, employing tools from Lie theory and higher-dimensional geometry within language studies. But, even more intriguingly, our study hints at a mathematical characterization of meta-linguistic or pre-verbal thought, namely of those cognitive structures that precede language.
AI Can Enhance Creativity in Social Networks
Baten, Raiyan Abdul, Bangash, Ali Sarosh, Veera, Krish, Ghoshal, Gourab, Hoque, Ehsan
Can peer recommendation engines elevate people's creative performances in self-organizing social networks? Answering this question requires resolving challenges in data collection (e.g., tracing inspiration links and psycho-social attributes of nodes) and intervention design (e.g., balancing idea stimulation and redundancy in evolving information environments). We trained a model that predicts people's ideation performances using semantic and network-structural features in an online platform. Using this model, we built SocialMuse, which maximizes people's predicted performances to generate peer recommendations for them. We found that treatment networks leveraging SocialMuse outperformed AI-agnostic control networks on several creativity measures. The treatment networks were more decentralized than the control, as SocialMuse increasingly emphasized network-structural features at large network sizes. This decentralization spreads people's inspiration sources, helping inspired ideas stand out better. Our study provides actionable insights into building intelligent systems for elevating creativity.
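Recommending peers by maximizing a user's predicted performance can be sketched as a greedy selection loop over a learned predictor. The greedy strategy and the toy predictor below (which rewards diverse inspiration sources and penalizes redundancy) are assumptions for illustration; SocialMuse's actual model uses trained semantic and network-structural features.

```python
def recommend_peers(user, candidates, predict, k=2):
    """Greedily pick the k candidates whose addition maximizes the
    user's predicted ideation performance (sketch; `predict` stands
    in for the paper's trained model)."""
    chosen = []
    pool = list(candidates)
    for _ in range(min(k, len(pool))):
        best = max(pool, key=lambda c: predict(user, chosen + [c]))
        chosen.append(best)
        pool.remove(best)
    return chosen

def toy_predict(user, peers):
    # Reward distinct communities (diverse inspiration sources),
    # penalize duplicates from the same community (redundancy).
    communities = {p["community"] for p in peers}
    return len(communities) - 0.5 * (len(peers) - len(communities))

peers = [
    {"name": "a", "community": 1},
    {"name": "b", "community": 1},
    {"name": "c", "community": 2},
]
picked = recommend_peers("u", peers, toy_predict, k=2)
```

Note how the redundancy penalty makes the second pick come from a different community than the first, which mirrors the decentralizing effect reported in the abstract.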
Creativity in AI: Progresses and Challenges
Ismayilzada, Mete, Paul, Debjit, Bosselut, Antoine, van der Plas, Lonneke
Creativity is the ability to produce novel, useful, and surprising ideas, and has been widely studied as a crucial aspect of human cognition. Machine creativity, on the other hand, has been a long-standing challenge. With the rise of advanced generative AI, there has been renewed interest and debate regarding AI's creative capabilities. Therefore, it is imperative to revisit the state of creativity in AI and identify key progresses and remaining challenges. In this work, we survey leading works studying the creative capabilities of AI systems, focusing on creative problem-solving, linguistic, artistic, and scientific creativity. Our review suggests that while the latest AI models are largely capable of producing linguistically and artistically creative outputs such as poems, images, and musical pieces, they struggle with tasks that require creative problem-solving, abstract thinking, and compositionality, and their generations suffer from a lack of diversity and originality, long-range incoherence, and hallucinations. We also discuss key questions concerning copyright and authorship issues with generative models. Furthermore, we highlight the need for a comprehensive evaluation of creativity that is process-driven and considers several dimensions of creativity. Finally, we propose future research directions to improve the creativity of AI outputs, drawing inspiration from cognitive science and psychology.